Automated Text Categorization Using Support Vector Machine

نویسنده

  • James T. Kwok
چکیده

In this paper, we study the use of support vector machine in text categorization. Unlike other machine learning techniques , it allows easy incorporation of new documents into an existing trained system. Moreover, dimension reduction, which is usually imperative, now becomes optional. Thus, SVM adapts eeciently in dynamic environments that require frequent additions to the document collection. Empirical results on the Reuters-22173 collection are also discussed.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Automated Arabic Text Categorization Using SVM and NB

Text classification is a supervised learning technique that uses labeled training data to derive a classification system (classifier) and then automatically classifies unlabelled text data using the derived classifier. In this paper, we investigate Naïve Bayesian method (NB) and Support Vector Machine algorithm (SVM) on different Arabic data sets. The bases of our comparison are the most popula...

متن کامل

Using Bag-of-Concepts to Improve the Performance of Support Vector Machines in Text Categorization

This paper investigates the use of conceptbased representations for text categorization. We introduce a new approach to create concept-based text representations, and apply it to a standard text categorization collection. The representations are used as input to a Support Vector Machine classifier, and the results show that there are certain categories for which concept-based representations co...

متن کامل

Improving the Operation of Text Categorization Systems with Selecting Proper Features Based on PSO-LA

With the explosive growth in amount of information, it is highly required to utilize tools and methods in order to search, filter and manage resources. One of the major problems in text classification relates to the high dimensional feature spaces. Therefore, the main goal of text classification is to reduce the dimensionality of features space. There are many feature selection methods. However...

متن کامل

Text categorization using topic model and ontology networks

Text categorization based on pre-defined document categories is one of the most crucial tasks in text mining applications in recent decades. Successful text categorization highly relies on the text representations generated from documents. In this paper, an innovative text categorization model, VSM_WN_TM, is presented. VSM_WN_TM is a special Vector Space Model (VSM) that incorporates word frequ...

متن کامل

Text Categorization and Support Vector Machines

Text categorization is used to automatically assign previously unseen documents to a predefined set of categories. This paper gives a short introduction into text categorization (TC), and describes the most important tasks of a text categorization system. It also focuses on Support Vector Machines (SVMs), the most popular machine learning algorithm used for TC, and gives some justification why ...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 1998